{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# [xturing](https://github.com/stochasticai/xturing) - DistilGPT-2 efficient fine-tuning tutorial\n",
    "\n",
    "This tutorial aims to show how easy it is to perform fine-tuning with xturing. If you have access to A100 80GB GPUs, we recommend you to start the LLaMA notebook. This model is much better and the results are impressive!"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. Install the `xturing` library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install xturing --upgrade"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Download and unzip the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://d33tr4pxdm6e2j.cloudfront.net/public_content/tutorials/datasets/alpaca_data.zip\n",
    "!unzip alpaca_data.zip"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Load the dataset and initialize the model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/opt/conda/envs/stochastic/lib/python3.8/site-packages/tqdm/auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
      "  from .autonotebook import tqdm as notebook_tqdm\n",
      "\u001b[92m2023-03-27 16:00:54,235 | DEBUG | xturing.models.causal 44 | Finetuning parameters: learning_rate=0.003 gradient_accumulation_steps=1 batch_size=16 weight_decay=0.01 warmup_steps=50 eval_steps=5000 save_steps=5000 max_length=512 num_train_epochs=3 logging_steps=10 max_grad_norm=2.0 save_total_limit=4 optimizer_name='adamw' output_dir='saved_model'\u001b[0m\n",
      "\u001b[92m2023-03-27 16:00:54,236 | DEBUG | xturing.models.causal 45 | Generation parameters: penalty_alpha=0.6 top_k=0 max_new_tokens=256 do_sample=True top_p=0.92\u001b[0m\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "trainable params: 147456 || all params: 82060032 || trainable%: 0.17969283755580304\n"
     ]
    }
   ],
   "source": [
    "from xturing.datasets.instruction_dataset import InstructionDataset\n",
    "from xturing.models import BaseModel\n",
    "\n",
    "instruction_dataset = InstructionDataset(\"../llama/alpaca_data\")\n",
    "# Initializes the model\n",
    "model = BaseModel.create(\"distilgpt2_lora\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4. Start the finetuning\n",
    "\n",
    "Before starting the finetuning you can specify several parameters. In this example, the `batch_size` and the `num_train_epochs` will be modified."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Finetuning parameters: learning_rate=0.003 gradient_accumulation_steps=1 batch_size=64 weight_decay=0.01 warmup_steps=50 eval_steps=5000 save_steps=5000 max_length=512 num_train_epochs=1 logging_steps=10 max_grad_norm=2.0 save_total_limit=4 optimizer_name='adamw' output_dir='saved_model'\n"
     ]
    }
   ],
   "source": [
    "finetuning_config = model.finetuning_config()\n",
    "\n",
    "finetuning_config.batch_size = 64\n",
    "finetuning_config.num_train_epochs = 1\n",
    "\n",
    "print(f\"Finetuning parameters: {finetuning_config}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "/opt/conda/envs/stochastic/lib/python3.8/site-packages/lightning_fabric/connector.py:562: UserWarning: 16 is supported for historical reasons but its usage is discouraged. Please set your precision to 16-mixed instead!\n",
      "  rank_zero_warn(\n",
      "Using 16bit Automatic Mixed Precision (AMP)\n",
      "GPU available: True (cuda), used: True\n",
      "TPU available: False, using: 0 TPU cores\n",
      "IPU available: False, using: 0 IPUs\n",
      "HPU available: False, using: 0 HPUs\n",
      "/opt/conda/envs/stochastic/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/logger_connector/logger_connector.py:67: UserWarning: Starting from v1.9.0, `tensorboardX` has been removed as a dependency of the `pytorch_lightning` package, due to potential conflicts with other packages in the ML ecosystem. For this reason, `logger=True` will use `CSVLogger` as the default logger, unless the `tensorboard` or `tensorboardX` packages are found. Please `pip install lightning[extra]` or one of them to enable TensorBoard support by default\n",
      "  warning_cache.warn(\n",
      "/opt/conda/envs/stochastic/lib/python3.8/site-packages/pytorch_lightning/trainer/configuration_validator.py:72: PossibleUserWarning: You defined a `validation_step` but have no `val_dataloader`. Skipping val loop.\n",
      "  rank_zero_warn(\n",
      "You are using a CUDA device ('NVIDIA A100-SXM4-80GB') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision\n",
      "LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]\n",
      "\n",
      "  | Name          | Type                 | Params\n",
      "-------------------------------------------------------\n",
      "0 | pytorch_model | PeftModelForCausalLM | 82.1 M\n",
      "-------------------------------------------------------\n",
      "147 K     Trainable params\n",
      "81.9 M    Non-trainable params\n",
      "82.1 M    Total params\n",
      "328.240   Total estimated model params size (MB)\n",
      "/opt/conda/envs/stochastic/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:430: PossibleUserWarning: The dataloader, train_dataloader, does not have many workers which may be a bottleneck. Consider increasing the value of the `num_workers` argument` (try 12 which is the number of cpus on this machine) in the `DataLoader` init to improve performance.\n",
      "  rank_zero_warn(\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 0:   0%|          | 0/813 [00:00<?, ?it/s] "
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "You're using a GPT2TokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.\n",
      "/opt/conda/envs/stochastic/lib/python3.8/site-packages/transformers/tokenization_utils_base.py:2365: UserWarning: `max_length` is ignored when `padding`=`True` and there is no truncation strategy. To pad to max length, use `padding='max_length'`.\n",
      "  warnings.warn(\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 0:  75%|███████▌  | 612/813 [02:05<00:41,  4.87it/s, v_num=3, loss=0.977]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "Token indices sequence length is longer than the specified maximum sequence length for this model (1469 > 1024). Running this sequence through the model will result in indexing errors\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 0: 100%|██████████| 813/813 [02:44<00:00,  4.93it/s, v_num=3, loss=0.598]"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "`Trainer.fit` stopped: `max_epochs=1` reached.\n"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 0: 100%|██████████| 813/813 [02:45<00:00,  4.92it/s, v_num=3, loss=0.598]\n"
     ]
    }
   ],
   "source": [
    "# Finetuned the model\n",
    "model.finetune(dataset=instruction_dataset)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 5. Generate an output text with the fine-tuned model\n",
    "\n",
    "You can also customize some parameters to generate text."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "GenerationConfig(penalty_alpha=0.6, top_k=0, max_new_tokens=256, do_sample=True, top_p=0.92)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "generation_config = model.generation_config()\n",
    "generation_config"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "  0%|          | 0/1 [00:00<?, ?it/s]The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.\n",
      "Setting `pad_token_id` to `eos_token_id`:50256 for open-end generation.\n",
      "100%|██████████| 1/1 [00:00<00:00,  1.22it/s]"
     ]
    },
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Generated output by the model: ['Give three tips for staying healthy.1. Whether you are new to physical activity or simply want a better life in your home.\\n2. The reasons for your lifestyle choices are far more important to you than anything else.\\n3. Know that each and every day is different, and that your body, mind, and body is something that both people and those who live in that situation have to work on.<|endoftext|>']\n"
     ]
    },
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "\n"
     ]
    }
   ],
   "source": [
    "# Once the model has been finetuned, you can start doing inferences\n",
    "output = model.generate(texts=[\"Give three tips for staying healthy.\"])\n",
    "print(\"Generated output by the model: {}\".format(output))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 6. Save your model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Save the model\n",
    "model.save(\"./distilgpt2\")\n",
    "\n",
    "# If you want to load the model just do BaseModel.load(\"./distilgpt2\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Do you have any questions?\n",
    "\n",
    "You can open an issue in our [GitHub repo](https://github.com/stochasticai/xturing) \n"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "e426334cf9b580c18b05e507a1242529de5a6577a076046e5b890cf746ad84b5"
  },
  "kernelspec": {
   "display_name": "Python 3.8.16 ('stochastic')",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.16"
  },
  "orig_nbformat": 4
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
